A Partial-Repeatability Approach to Data Mining

Authors

  • Kai-Yuan Cai
  • Yunfei Yin
  • Shichao Zhang
Abstract

Unlike the data handled in traditional data mining, software data are characterized by partial repeatability, or parepeatics: an invariant property that can neither be proved mathematically nor validated to high accuracy physically, yet still (partially) governs the behavior of the data. Parepeatics emerges when the universe under study cannot be accurately characterized. The universe of all possible C language programs is one example: it cannot be accurately characterized because humans write defect-prone programs. In this paper we design a parepeatic mining framework for software data mining, in which the mined knowledge is represented as parepeatic models. A parepeatic model consists of central knowledge, a knowledge fluctuation zone, and a correctness factor. Our approach generates the required parepeatic model from a given dataset as a new form of knowledge representation and applies it to software data mining. Experimental results with real C language programs show that the proposed approach is effective.
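The three-part structure named in the abstract can be pictured with a small data structure. The sketch below is only an illustration, not the authors' method: the abstract does not say how any of the three parts is computed, so the choices here (median as central knowledge, a fixed-width interval as the fluctuation zone, and the fraction of observations inside that interval as the correctness factor) are assumptions, as are the names `ParepeaticModel`, `fit_parepeatic_model`, and `half_width`.

```python
from dataclasses import dataclass
from statistics import median
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ParepeaticModel:
    """Hypothetical container for the three parts named in the abstract."""
    central: float              # assumed stand-in for "central knowledge"
    zone: Tuple[float, float]   # assumed stand-in for the "knowledge fluctuation zone"
    correctness: float          # assumed stand-in for the "correctness factor"

    def covers(self, value: float) -> bool:
        """Return True when a new observation falls inside the zone."""
        low, high = self.zone
        return low <= value <= high


def fit_parepeatic_model(values: Sequence[float], half_width: float) -> ParepeaticModel:
    """Build an illustrative model from numeric observations.

    Assumption: the median is the central value, the zone is
    [median - half_width, median + half_width], and the correctness
    factor is the share of observations that land inside the zone.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    central = float(median(values))
    zone = (central - half_width, central + half_width)
    inside = sum(1 for v in values if zone[0] <= v <= zone[1])
    return ParepeaticModel(central, zone, inside / len(values))


# Made-up measurements of some software metric, for illustration only.
model = fit_parepeatic_model([3, 4, 4, 5, 5, 5, 6, 12], half_width=1)
```

In this reading, a correctness factor below 1 records that the mined rule holds only partially, which is one way to interpret a property that "partially" governs the data.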

Similar resources

Customer Retention Based on the Number of Purchase: A Data Mining Approach

Purpose: This study looks for a relationship between the number of purchases and the income a customer brings to the company. The aim is to use data mining tools to find customers who buy more than one life insurance policy and also show signs of good payment behavior. Design/methodology/approach: The approach of this research is to use data mining tools...

a swift heuristic algorithm base on data mining approach for the Periodic Vehicle Routing Problem: data mining approach

The periodic vehicle routing problem focuses on establishing a plan of visits to clients over a given time horizon so as to satisfy some service level while optimizing the routes used in each time period. This paper presents a new, effective heuristic algorithm based on data mining tools for the periodic vehicle routing problem (PVRP). The results of the proposed algorithm are compared with the resu...

SIGMOD RWE Review “Towards Proximity Pattern Mining in Large Graphs”

This document is a review report on the paper “Towards Proximity Pattern Mining in Large Graphs” by A. Khan, X. Yen, and K. Wu, prepared by SIGMOD’s 2010 Repeatability and Workability Evaluation Committee. In this section the provided resources (code, data sets, setup information) and hardware setups of the authors and reviewers are discussed. Detailed information on all experiments that the review condu...

Development of a Unique Biometric-based Cryptographic Key Generation with Repeatability using Brain Signals

Network security is very important when sending confidential data through a network. Cryptography is the science of hiding information, and combining cryptographic solutions with cognitive science opens a new branch called cognitive cryptography, which guarantees the confidentiality and integrity of the data. Brain signals, as a biometric indicator, can be converted to a binary code which can be...

A new approach based on data envelopment analysis with double frontiers for ranking the discovered rules from data mining

Data envelopment analysis (DEA) is a relatively new data-oriented approach for evaluating the performance of a set of peer entities, called decision-making units (DMUs), that convert multiple inputs into multiple outputs. Within a relatively short period, DEA has become a strong quantitative and analytical tool for measuring and evaluating performance. In an article written by Toloo et al. (2009...

Journal:
  • IEEE Intelligent Informatics Bulletin

Volume 5

Publication date: 2005